Complex systems may often be characterized by their hierarchical dynamics. In this
paper we present a method and an operational algorithm that automatically infer
this property in a broad range of systems: discrete stochastic processes. The main idea
is to systematically explore the set of projections from the state space of a process to
smaller state spaces, and to determine which of the projections impose Markovian
dynamics on the coarser level. These projections, which we call Markov projections,
then constitute the hierarchical dynamics of the system. The algorithm operates on time
series or other statistics, so a priori knowledge of the intrinsic workings of a system is
not required in order to determine its hierarchical dynamics. We illustrate the method
by applying it to two simple processes: a finite state automaton and an iterated map.

Keywords: Hierarchical dynamics; Model reduction; Coarse graining. 



1. Introduction 

Modularity and hierarchical organization play an important role in determining
the character of a dynamical system. It could even be argued that hierarchical
self-organization is a necessary condition for a system to display a high degree of
complexity, see e.g. Simon [1]. Hierarchical dynamics is also a prerequisite for efficient
model reduction. The general strategy when reducing the level of detail
in a model is to find a partition of the degrees of freedom (i.e. a projection of the
phase space) which by itself forms a system with Markovian dynamics, an observation
that is also discussed in detail by Shalizi and Moore [2]. Conversely, one can
use the idea behind the time-delay embedding method for attractor reconstruction
[3, 4] to convince oneself that a dynamical system without the Markov property
should be reconstructed in a higher dimensional phase space in order to make sense
as a causal model.

In a physical system, modularity is usually associated with separations in time
and length scales. In this paper we focus on hierarchical decomposition of stochastic
processes that, in general, are systems without direct physical interpretation.
Despite the lack of guidance from physical intuition, there are several methods that
can be used for determining their hierarchical structure. For example, if one has full
access to the inner workings of a process' dynamics, i.e. the generative semi-group,
Krohn-Rhodes theory can be used to decompose the semi-group as a hierarchical
wreath product of finite groups and finite aperiodic semi-groups [5]. Here we
present a method for hierarchical decomposition and reconstruction that instead
operates on the sequence of states, i.e. a time series, that is generated by the process
at hand. In this way, prior knowledge of the process' intrinsic dynamics is not
necessary.

1.1. Historical background 

The line of ideas on decomposition of dynamical systems and identification of hierarchical
dynamics can be traced back to the analysis of continuous symmetries in
classical mechanics as advanced by Lie^a, Lagrange, Poisson, Jacobi and Noether.
These reduction schemes result in the elimination of inactive, i.e. constant, degrees
of freedom. In non-equilibrium statistical mechanics, dimensional reduction generally
means going from an effectively deterministic model to a Langevin type model
that includes a noise term. The noise stems from fast, usually chaotic, motion that
on the time scale of the relevant (slow) degrees of freedom can be approximated as
white noise, resulting in a Markovian dynamics for the slow degrees of freedom. This
idea was first formalized by Zwanzig [6] and has later matured into the theory of
adiabatic elimination, see e.g. [7, 8]. Yet another situation frequently encountered in
models of natural systems is dissipative driven processes. A generic feature of such
systems is that fast degrees of freedom, due to large negative Lyapunov exponents
associated with the dissipation, often relax to a quasi-fixed point, i.e. a point in
the phase space that appears effectively fixed on the time scale of the fast dynamics,
but changes on the time scale set by the slow degrees of freedom. The overall
dynamics is therefore in this case slaved to a slow positively invariant manifold and
the resulting dimensionality is reduced. This picture has been advanced by Haken
in his work on self-organization [9]. Lately this idea has also been revitalized in the
turbulence community, primarily by a proof of existence of inertial manifolds in a
class of hyperbolic dynamical systems [10]. In the same spirit, positively invariant
manifolds are used in model reduction schemes in chemical kinetics [11]. Finally, it
is also worth mentioning that the connection between chaotic dynamical systems
and non-equilibrium statistical mechanics has recently been further clarified by the
work of Ruelle et al. [12, 13].

1.2. The method 

We start with a discussion on the general problem of how to define a hierarchical 
organization in a dynamical system. From a constructive point of view, a hierarchical 
system should be composed of interacting modules that contain, in some sense, 
smaller interacting modules. The process could be repeated to recursively generate 
new hierarchical levels as illustrated in Figure 1. Conversely, a system is said to have
hierarchical structure if it can be deconstructed through recursive decomposition of
modular components.

^a Strictly speaking, Lie's work is not limited to classical mechanics and can be applied to any
differential equation.

Fig. 1. Three modes of a dynamical system. A schematic illustration of how recursive modularity
builds up hierarchies. There are three interacting modules in (c) that each contain interacting
modules from (b) that each in turn contain interacting modules from (a). Given the dynamics of
the modules in (a), how can we derive the existence and dynamics of the larger modules in (b)
and (c)?

Fig. 2. A transition $s_t \mapsto s_{t+1} = T_\Sigma(s_t)$ is decomposed into a transition in the wreath
product $(X, T_X) \wr (Y, T_Y) \wr (Z, T_Z)$. The state $(z_t, y_t, x_t)$ is mapped to
$(z_{t+1}, y_{t+1}, x_{t+1}) = (z_t \cdot f_z(y_t, x_t),\ y_t \cdot f_y(x_t),\ x_t \cdot f_x)$ by the
transformation $(f_z, f_y, f_x)$, where $f_z : Y \times X \to Z$, $f_y : X \to Y$ and $f_x \in T_X$,
and where dot denotes group operation. See [14] for further details. We may project away $z_t$,
as $x_t$ and $y_t$ evolve independently of $z_t$, and we may project away $y_t$ together with $z_t$,
since they are slaved by $x_t$. We may not, on the other hand, project away $y_t$ by itself, since it
influences the evolution of $z_t$.

It is natural to invoke an operation that, in a categorical sense, carries the
structure of a product: the wreath product, see Figure 2. Associated with the
wreath product, there exists a quotient operator, or a projection, which in essence
defines divisibility of the system. The projection is in general a map from the full
state space (phase space in the continuous case) to a smaller^b state space^c. The
dynamics of the original system induces a dynamics on the projected state space.
In principle any reducing map can be used as a hypothetical projection. However,
crucial properties of the original dynamics are often lost in an arbitrary reduction.
The most important such property is the Markov property^d. Only very special
reductions, which we call Markov projections, respect the fundamental character
of the dynamics. When this happens, the diagram in Figure 3 commutes. We may
say that the dynamics is divisible by the quotient used by the projection, e.g. in
Figure 4. For a continuous deterministic dynamical system we call a projection
that respects the dynamics fiber preserving, although in this paper we focus on
dynamics generated by discrete stochastic processes. The idea behind the method
is to systematically test different reductions to see if they result in a Markovian
dynamics on the reduced state space. If a reduction passes the test we conclude
that it is a proper projection of the state space, i.e. a Markov projection.

^b Lower dimensionality in the continuous case and lower cardinality in the finite discrete case.
^c Normally a projection is also required to be idempotent, i.e. fulfill $P^2 = P$. In our case, however,
the projection maps between two different spaces and idempotency is not well defined.
^d Note that determinism is a special case of the Markov property.

Fig. 3. An original state space $\Sigma$ is reduced to a coarser state space $\mathcal{A}$. The map $\Pi$ is a
reduction. $T_\Sigma$ denotes the original update dynamics and $T_\mathcal{A}$ is the dynamics induced on
$\mathcal{A}$ by $\Pi$. If $T_\mathcal{A}$ is a Markovian dynamics, the diagram commutes and we say that $\Pi$ is a
Markov projection.

Fig. 4. (a) Direct product. Both automata can be projected away as they act independently of
each other. (b) Semidirect product. Only the left automaton can be projected away since it is
dependent on, i.e. is slaved by, the right automaton.

The method introduced in this paper is inspired by computational mechanics
[15, 16], which derives optimal predictors of stochastic processes^e. The predictors,
termed ε-machines, are automata whose nodes are equivalence classes, termed causal
states, of observed histories of states (i.e. semi-infinite sequences in the context of
stochastic processes) such that two histories are equivalent if they condition the
same probability distribution of future observed states. Causal states are connected
by transitions that are labeled with the current state of the observed process,
which completes the ε-machine. An ε-machine is the minimal and maximally efficient
model of the observed process [16] and may in practice be acquired approximately,
e.g. from generated time series [18]. ε-Machines are, in addition, Markov and one
may use them to infer hierarchical dynamics in terms of causal states.

The dynamics represented by an ε-machine operates on the raw micro state space
of the observed process. It is therefore only a reduction if the original system has
fewer active states than the state space admits. Our method, in contrast, infers
coarse grained dynamics on different hierarchical levels. Since inferred dynamics
in our case by definition exhibits the Markov property, its finite state automaton
representation is the minimal one and equivalent, subject to converting nodes to
transitions^f, to the ε-machine of the same coarse grained dynamics.

2. Markov projections 

We will now describe our method in more detail. Before continuing, the reader may 
want to consult the appendix for a brief review of the concepts that are central to 
our approach and to see our use of notation. 

2.1. General idea 

Say that you have a symbol sequence $s$ that has been generated by a stochastic
process over some state space $\Sigma$:

$$s = (\ldots, s_{t-1}, s_t, s_{t+1}, \ldots), \qquad s_\tau \in \Sigma. \tag{1}$$

You wish to determine if the process exhibits hierarchical dynamics according to
the diagram in Figure 3 and, if so, the nature of this hierarchy. To do this one may
use the following general procedure. Systematically examine the set of possible
partitions of $\Sigma$.

^e The idea of optimal predictors has been introduced on many different occasions, within different
contexts. See [17] for references and details.

^f For the reader inclined to compare our approach and computational mechanics in more detail,
it may be useful, in order to avoid initial confusion, to note that the states of a stochastic
process label the transitions of an ε-machine, whereas they label the nodes in our automaton
representation.

Fig. 5. The original symbol sequence $(\ldots, s_{t-1}, s_t, s_{t+1}, \ldots)$ is projected onto a symbol
sequence $(\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1}, \ldots)$. Find a projection $\Pi_\mathcal{A}$ such that the
imposed dynamics has the Markov property. Then a state $\tilde{s}_\tau$ only depends on its previous
state $\tilde{s}_{\tau-1}$.

For each examined partition $\mathcal{A}$:

(1) Map $s$ onto a new symbol sequence $\tilde{s}$ and the corresponding process $T_\mathcal{A}$ with
the projection $\Pi_\mathcal{A} : \Sigma \to \mathcal{A}$,

$$\tilde{s} = (\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1}, \ldots), \qquad \tilde{s}_\tau = \Pi_\mathcal{A}(s_\tau), \tag{2}$$

as in Figure 5.

(2) Test if the coarse grained sequence $\tilde{s}$ is constituted by Markovian dynamics.

The Markov property in step (2) may be identified in the realm of information
theory as we discuss next.
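
Step (1) of the procedure above can be sketched in a few lines of Python. This is a minimal
illustration under our own assumptions, not the authors' implementation: the names
`make_projection` and `project_sequence` are hypothetical, states are represented by integers,
and partition elements are represented as frozen sets.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List

Block = FrozenSet[Hashable]


def make_projection(partition: Iterable[Iterable[Hashable]]) -> Dict[Hashable, Block]:
    """Build the map Pi_A that sends each state to the partition element containing it."""
    projection: Dict[Hashable, Block] = {}
    for block in partition:
        frozen = frozenset(block)
        for state in frozen:
            if state in projection:
                raise ValueError(f"state {state!r} occurs in more than one block")
            projection[state] = frozen
    return projection


def project_sequence(sequence: Iterable[Hashable],
                     projection: Dict[Hashable, Block]) -> List[Block]:
    """Apply Pi_A symbol by symbol, as in Eq. (2)."""
    return [projection[s] for s in sequence]


# Illustrative partition {{1}, {2, 3}, {4}} of the state space {1, 2, 3, 4}.
pi_a = make_projection([{1}, {2, 3}, {4}])
coarse = project_sequence([1, 2, 4, 3, 3, 1], pi_a)
```

States 2 and 3 become indistinguishable in `coarse`, which is the sequence that step (2) then
tests for the Markov property.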



2.2. Markov property measure 

Let $a_i$, $i = 1, 2, \ldots, |\mathcal{A}|$, be specific elements in a partition $\mathcal{A}$ (not to be
confused with $\tilde{s}_t$, which are variables over elements in $\mathcal{A}$ at certain times $t$). For each
state $a_i$, let $X_i$ be a stochastic variable of the past states preceding $a_i$, and let $Y_i$ be a
stochastic variable of the subsequent state of $a_i$. If $X_i$ and $Y_i$ are independent for all
$a_i \in \mathcal{A}$, $T_\mathcal{A}$ is a Markov process. There are a number of different ways to quantify the
degree by which, or probability that, two distributions are independent. One common
method is the $\chi^2$ test [19]. Its main purpose, however, is to provide the significance
of association between two variables, rather than the strength of association that we
prefer. Although there indeed are measures of the strength of association based on $\chi^2$
statistics, e.g. Cramér's V and the contingency coefficient C [12], these are ill-suited
for our purposes as the former exhibits discontinuities with varying contingency
table size, and as the latter requires tables with an equal number of rows and columns.
In addition, these measures lack direct interpretations. Instead we use the mutual
information $I(X_i; Y_i)$ between $X_i$ and $Y_i$ as a measure of dependence with respect
to state $a_i$. This measure grants clear-cut interpretations and, naturally, enables us
to use tools from information theory. The mutual information is defined as

$$I(X_i; Y_i) = H(X_i) + H(Y_i) - H(X_i, Y_i), \tag{3}$$

where $H(V)$ is the Shannon entropy

$$H(V) = -\sum_{v \in \mathcal{V}} P(V = v) \log_2 P(V = v) \tag{4}$$

of a stochastic variable $V$ drawn from $\mathcal{V}$, and where $H(U, V)$ is the analogous entropy
of the joint distribution of two stochastic variables $U$ and $V$. $I(X_i; Y_i)$ is the
information one gains when replacing the separate distributions $P(X_i)$ and $P(Y_i)$
with their joint distribution $P(X_i, Y_i)$. $I(X_i; Y_i) \geq 0$ with equality when $X_i$ and $Y_i$
are independent [20]. As a Markov property measure of a partition as a whole, we
employ the expected mutual information

$$\langle I \rangle = \sum_{i} P(a_i)\, I(X_i; Y_i), \tag{5}$$

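where $P(a_i)$ is the probability of state $a_i$. Similarly, we use the shorthand
$P(\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1})$ to denote the probability that a sequence of stochastic
variables $(\ldots, \tilde{S}_{t-1}, \tilde{S}_t, \tilde{S}_{t+1})$ has the outcome $(\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1})$. Using the
definition of conditional probabilities and explicitly representing pasts and futures as substrings,
we can rewrite Eq. (5) as

\begin{align}
\langle I \rangle ={} & -\sum_{\tilde{s}_t, \tilde{s}_{t+1}} P(\tilde{s}_t, \tilde{s}_{t+1}) \log_2 P(\tilde{s}_t, \tilde{s}_{t+1})
  + \sum_{\tilde{s}_t} P(\tilde{s}_t) \log_2 P(\tilde{s}_t) \notag \\
& - \sum_{\ldots, \tilde{s}_{t-1}, \tilde{s}_t} P(\ldots, \tilde{s}_{t-1}, \tilde{s}_t) \log_2 P(\ldots, \tilde{s}_{t-1}, \tilde{s}_t) \notag \\
& + \sum_{\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1}} P(\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1}) \log_2 P(\ldots, \tilde{s}_{t-1}, \tilde{s}_t, \tilde{s}_{t+1}) \notag \\
={} & \Delta H_2 - \Delta H_\infty, \tag{6}
\end{align}

where $\Delta H_n = H_n - H_{n-1}$ is the slope of the block entropy $H_n$ of $\tilde{S}$ at length $n$ [21]. The
expected mutual information $\langle I \rangle$ between past symbols and the next symbol is in other words
equivalent to the difference between the expected uncertainty of a symbol conditioned on
the preceding symbol and the expected uncertainty of a symbol conditioned on all
preceding symbols. $\langle I \rangle = 0$ if one expects no reduction in the uncertainty of the
current state from looking further back than one state.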

Fig. 6. Example process $T_\Sigma$ over the state space $\Sigma = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$. Edges are labeled with
transition probabilities.

Fig. 7. Dynamics (a) $T_\mathcal{B}$, (b) $T_\mathcal{C}$ and (c) $T_\mathcal{D}$ resulting from Markov projections of the example
process $T_\Sigma$ in Figure 6.

Fig. 8. Dynamics hierarchy of $T_\Sigma$ for $p = 1/2$ with simplified representation to the right. The
dashed arrow implies that $\mathcal{A}$ may be directly projected onto $\mathcal{D}$. The arrow is redundant though,
since projections may always be composed, e.g.,
$\Pi^{\mathcal{A}}_{\mathcal{D}} = \Pi^{\mathcal{B}}_{\mathcal{D}} \circ \Pi^{\mathcal{A}}_{\mathcal{B}}$ in this case.

In practice one acquires approximations of $X_i$ and $Y_i$ from a finite symbol sequence
and finite history lengths. For each symbol $a_i \in \mathcal{A}$, we set up a contingency
table whose rows are values of $X_i$ of occurring histories $(\tilde{s}_{t-n}, \ldots, \tilde{s}_{t-2}, \tilde{s}_{t-1})$ of length
$n$ (drawn from $\mathcal{A}^n$) that precede $a_i$, and whose columns are values of $Y_i$ of possible
subsequent symbols $\tilde{s}_{t+1}$ (drawn from $\mathcal{A}$) of $a_i$. Element $(j, k)$ in the table is then the
count that history $j$ (according to some indexing) is followed by $a_i$ and then $a_k$.
Eq. (6) generalizes to finite history lengths:

$$\langle I_n \rangle = \Delta H_2 - \Delta H_{n+2}, \tag{7}$$

where $\langle I_n \rangle$ is the expected mutual information when histories of length $n$ are considered.
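
As an illustration of Eq. (7), the following Python sketch estimates $\langle I_n \rangle$ from a finite
symbol sequence through empirical block entropies. It is a plug-in estimate under our own
assumptions (overlapping windows, no bias correction); the function names and the three test
sequences are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import log2
import random


def block_entropy(seq, length):
    """Shannon entropy in bits of the empirical distribution of blocks of `length` symbols."""
    if length == 0:
        return 0.0
    counts = Counter(tuple(seq[i:i + length]) for i in range(len(seq) - length + 1))
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def entropy_slope(seq, n):
    """Delta H_n = H_n - H_{n-1}."""
    return block_entropy(seq, n) - block_entropy(seq, n - 1)


def expected_mutual_information(seq, history_length):
    """Plug-in estimate of <I_n> = Delta H_2 - Delta H_{n+2}."""
    return entropy_slope(seq, 2) - entropy_slope(seq, history_length + 2)


# A first-order Markov sequence (strict alternation): the estimate is close to 0.
alternating = [0, 1] * 2000
# A period-4 sequence where the next symbol depends on the two previous ones:
# the estimate is close to 1 bit, so the sequence is not first-order Markov.
period4 = [0, 0, 1, 1] * 1000
# A random first-order Markov chain: a 1 is always followed by a 0.
rng = random.Random(0)
chain = [0]
for _ in range(20000):
    chain.append(0 if chain[-1] == 1 else rng.choice([0, 1]))

i_alternating = expected_mutual_information(alternating, 1)
i_period4 = expected_mutual_information(period4, 1)
i_chain = expected_mutual_information(chain, 1)
```

For a finite sequence the estimate is never exactly zero, so in practice a small threshold has to
be chosen to decide whether a projection counts as Markovian.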



3. Examples 

Before moving on to algorithmic details, we exemplify our method by applying it
to two simple stochastic processes: a finite state automaton and an iterated map.



3.1. A simple automaton 

Consider the automaton $T_\Sigma$ over the state space $\Sigma = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$ in Figure 6. All
15 possible partitions are evaluated. For $p = 1/2$, there are four Markov projections,
including those corresponding to the trivial partitions
$\mathcal{A} = \{\{\sigma_1\}, \{\sigma_2\}, \{\sigma_3\}, \{\sigma_4\}\}$ and $\mathcal{D} = \{d_1\}$, where $d_1 = \Sigma$. The third process,
$T_\mathcal{B}$, is a bit-flip process over $\mathcal{B} = \{b_1, b_2\}$, where $b_1 = \{\sigma_1, \sigma_3\}$ and
$b_2 = \{\sigma_2, \sigma_4\}$, i.e. repetition of $(b_1, b_2)$ blocks. The fourth projection gives
$\mathcal{C} = \{c_1, c_2\}$, where $c_1 = \{\sigma_1, \sigma_4\}$ and $c_2 = \{\sigma_2, \sigma_3\}$. $T_\mathcal{C}$ is such that $c_1$
and $c_2$ are generated with probability $1 - p$ and $p$ respectively, and a $c_1$ is always followed
by a $c_2$. For $p = 1/2$, this process is referred to as the golden mean process [22]. See
Figure 7. The Markov projections are related according to the hierarchy in Figure 8, where
the original process $T_\Sigma$ is the direct product of $T_\mathcal{B}$ and $T_\mathcal{C}$;
$T_\Sigma = T_\mathcal{B} \times T_\mathcal{C}$. The number of Markov projections is dependent on the
transition probability $p$. Figure 9 shows $\langle I_2 \rangle$ for the separate possible partitions as
functions of $p$. Specifically, at $p = 0$ and $p = 1$, more than four Markov projections
exist (e.g. $\{\{\sigma_1, \sigma_4\}, \{\sigma_2\}, \{\sigma_3\}\}$ and $\{\{\sigma_1, \sigma_2\}, \{\sigma_3, \sigma_4\}\}$) due to the
elimination of transitions in $T_\Sigma$.

Fig. 9. Expected mutual information in bits with respect to the 15 projections (of which two
examples are labeled) of the process $T_\Sigma$ in Figure 6 as functions of the transition probability $p$.
Note that $\langle I_2 \rangle$ of the Markov partitions is not visible since it is of the order of $10^{-[\ldots]}$ bits.
The statistics were collected from 1000 symbol sequences, each of length 1000.

3.2. An iterated map 

The second example concerns a process that belongs to a well-studied class of dynamical
systems: iterated maps. They are discrete in time, but operate on continuous
phase spaces. Our process, the Roof map, is defined over $[0, 1]$ according to


where $a$ is a parameter. We iterate the map and discretize the trajectory with four
equally large bins. That is, $x_i \mapsto \sigma_1$ if $x_i \in [0, 1/4[$, $x_i \mapsto \sigma_2$ if $x_i \in [1/4, 1/2[$,
$x_i \mapsto \sigma_3$ if $x_i \in [1/2, 3/4[$ and $x_i \mapsto \sigma_4$ otherwise. For $a = 1/2$, Figure 10, there are
five Markov projections that correspond to the partitions

$\mathcal{A} = \{\{\sigma_1\}, \{\sigma_2\}, \{\sigma_3\}, \{\sigma_4\}\}$,
$\mathcal{B} = \{\{\sigma_1, \sigma_2\}, \{\sigma_3\}, \{\sigma_4\}\}$,
$\mathcal{C} = \{\{\sigma_1, \sigma_2\}, \{\sigma_3, \sigma_4\}\}$,
$\mathcal{D} = \{\{\sigma_1, \sigma_2, \sigma_3\}, \{\sigma_4\}\}$ and
$\mathcal{E} = \{\{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}\}$.

$T_\mathcal{A}$ and $T_\mathcal{B}$ are shown in Figure 11, whereas $T_\mathcal{C}$ and $T_\mathcal{D}$ both are isomorphic to
the automaton in Figure 7(b) with $p = 1/2$. The Markov projections are related
according to the hierarchy in Figure 12(b). Note that $\mathcal{E}$ is the one-element partition
that trivially implies Markovian dynamics. For $a = 1/4$ and $a = 3/4$, Figure 12(a)
and 12(c) respectively, the Markovian dynamics of finest resolution is not provided
by $\mathcal{A}$. Instead, the partitions

$\mathcal{F} = \{\{\sigma_1\}, \{\sigma_2, \sigma_3, \sigma_4\}\}$,

for $a = 1/4$, and $\mathcal{D}$, for $a = 3/4$, are required, where $T_\mathcal{F}$ and $T_\mathcal{D}$ are isomorphic.

Fig. 10. Roof map for $a = 1/2$.

Fig. 11. Markovian dynamics (a) $T_\mathcal{A}$ and (b) $T_\mathcal{B}$ of discretized Roof map for $a = 1/2$.

Fig. 12. Dynamics hierarchies of discretized Roof map for (a) $a = 1/4$, (b) $a = 1/2$ and (c)
$a = 3/4$. Dashed arrows for composed projections are omitted.

4. Algorithm 

Although an exhaustive search through the set of all possible partitions works fine
for small $|\Sigma|$, it is an impractical strategy for larger state spaces due to a combinatorial
explosion. A state space with $n$ states allows for $B_n = \sum_{k=1}^{n} S(n, k)$ partitions,
where $B_n$ is called the Bell number and $S(n, k)$ is the Stirling number of the second kind. The
latter is the number of ways to partition a set with cardinality $n$ into $k$ nonempty subsets.
Both $B_n$ and $S(n, k)$ are large numbers (assuming $1 < k < n - 1$),

$$S(n, k) = \frac{1}{k!} \sum_{j=1}^{k} (-1)^{k-j} \binom{k}{j} j^n,$$

which calls for an approach other than sheer brute force.
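
To give a feeling for the growth, the short Python sketch below computes $S(n, k)$ and $B_n$. It uses
the standard recurrence $S(n, k) = S(n-1, k-1) + k\,S(n-1, k)$ rather than a closed form; the
function names are our own.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition an n-element set into k nonempty subsets."""
    if n == k:
        return 1  # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # Element n either forms a block of its own or joins one of the k existing blocks.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def bell(n: int) -> int:
    """Number of partitions of a set with n >= 1 elements, B_n = sum_k S(n, k)."""
    return sum(stirling2(n, k) for k in range(1, n + 1))


bell_numbers = [bell(n) for n in range(1, 11)]
```

A state space with four states gives `bell(4) == 15` partitions, the number evaluated in the
examples, while ten states already give `bell(10) == 115975`.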
4.1. Recursive partitions 

There are some basic relations between partitions that allow us to avoid an exhaustive
search of all possible partitions. When designing our algorithm, we exploit that
the mutual information of a partition $\mathcal{A}$, with respect to a state $a_i$, is larger than or equal
to the mutual information of a partition $\mathcal{B} \ni a_i$ of $\mathcal{A}$, with respect to $a_i$. To see this
inequality, consider the following. Let $\Pi$ be a projection from $\mathcal{A}$ to $\mathcal{B}$; $X_i$ and $\tilde{X}_i$
stochastic variables of the past states preceding $a_i$ with respect to $\mathcal{A}$ and $\mathcal{B}$; and $Y_i$
and $\tilde{Y}_i$ stochastic variables of the subsequent states of $a_i$ with respect to $\mathcal{A}$ and $\mathcal{B}$.
Since $Y_i$ is a function whose range is $\mathcal{A}$, it may be composed with $\Pi$. Then

$$\Pi(Y_i) = \tilde{Y}_i. \tag{9}$$

Let $\Phi$ be a function that maps semi-infinite sequences of states from $\mathcal{A}$ onto semi-infinite
sequences of states from $\mathcal{B}$:

$$\Phi(\ldots, s_{t-3}, s_{t-2}, s_{t-1}) = (\ldots, \Pi(s_{t-3}), \Pi(s_{t-2}), \Pi(s_{t-1})). \tag{10}$$

Then

$$\Phi(X_i) = \tilde{X}_i. \tag{11}$$

Since a function $g$ of a stochastic variable $V$ cannot increase the information about
another stochastic variable $U$, $I(U; V) \geq I(U; g(V))$ [20] (p. 35), we see that

$$I(X_i; Y_i) \geq I(X_i; \Pi(Y_i)) \geq I(\Phi(X_i); \Pi(Y_i)) = I(\tilde{X}_i; \tilde{Y}_i), \tag{12}$$

as $I$ is symmetric. The inequality (12) is helpful since if we know that an element $b_i$
in a partition $\mathcal{B}$ results in high mutual information, we may discard all partitions
$\mathcal{A} \ni b_i$ that project onto $\mathcal{B}$. This leads us to an algorithm that evaluates partitions
in ascending cardinal order, i.e. from coarser to finer partitions.
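
The chain of inequalities can be checked numerically on a small example. The Python sketch below
is only an illustration under our own assumptions: the joint distribution is made up, the past is
truncated to a single symbol, and the helper names are hypothetical.

```python
from collections import defaultdict
from math import log2


def mutual_information(joint):
    """I(X;Y) in bits from a dict {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * log2(p / (px[x] * py[y])) for (x, y), p in joint.items() if p > 0)


def coarse_grain(joint, fx=lambda x: x, fy=lambda y: y):
    """Joint distribution of (fx(X), fy(Y)), obtained by merging probabilities."""
    merged = defaultdict(float)
    for (x, y), p in joint.items():
        merged[(fx(x), fy(y))] += p
    return dict(merged)


# Made-up joint distribution of (past symbol, next symbol) over {a1, a2, a3}.
joint = {("a1", "a1"): 0.3, ("a2", "a2"): 0.3, ("a3", "a3"): 0.2,
         ("a3", "a1"): 0.1, ("a1", "a3"): 0.1}


def merge(a):
    """Projection Pi that merges a2 and a3 into a single element b."""
    return "b" if a in ("a2", "a3") else a


fine = mutual_information(joint)                                   # I(X; Y)
half = mutual_information(coarse_grain(joint, fy=merge))           # I(X; Pi(Y))
coarse = mutual_information(coarse_grain(joint, fx=merge, fy=merge))  # I(Phi(X); Pi(Y))
```

The three values are non-increasing, `fine >= half >= coarse`, in line with the inequality (12).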

4.2. Procedure 

The components of the algorithm are the sets $S_p$ (previous partition elements) and
$S_c$ (current partition elements), and the integer $l$ (level). Left arrow ($\leftarrow$) denotes
assignment.

(1) Initiation: $S_p \leftarrow 2^\Sigma$ (the power set of $\Sigma$), $S_c \leftarrow \emptyset$ and $l \leftarrow 2$.

(2) For every partition of size $l$ composed from elements in $S_p$:

(a) Evaluate $\langle I_n \rangle$ and store Markov projections.

(b) Add elements of size $\leq |\Sigma| - l$ with low mutual information $I_n$ to $S_c$.

If no partitions can be composed or if $l = |\Sigma|$, stop.

(3) $S_p \leftarrow S_c$, $S_c \leftarrow \emptyset$ and $l \leftarrow l + 1$. Go to step (2).

Partitions of size $l$ that are evaluated are thus those that can be composed from
elements that have implied low mutual information at level $l - 1$. In such a way,
partition elements that result in high mutual information are successively discarded
as these may not improve due to Eq. (12).
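
The level-wise procedure above might be organized as in the following Python sketch. It is a
hedged illustration, not the authors' code: `block_info` is a hypothetical callback that returns
the mutual information of one partition element within a given partition, the threshold is an
assumed parameter, and a partition is accepted when all of its elements fall below the threshold,
which is a simplification of testing $\langle I_n \rangle$ itself.

```python
from itertools import combinations


def partitions_from_blocks(states, allowed, size):
    """Yield each partition of `states` into exactly `size` blocks taken from `allowed`."""
    def build(remaining, blocks_left):
        if not remaining:
            if blocks_left == 0:
                yield []
            return
        if blocks_left == 0:
            return
        first, rest = remaining[0], remaining[1:]
        # The block holding `first` is `first` plus any subset of the other states.
        for r in range(len(rest) + 1):
            for extra in combinations(rest, r):
                block = frozenset((first,) + extra)
                if block in allowed:
                    left = [s for s in rest if s not in block]
                    for tail in build(left, blocks_left - 1):
                        yield [block] + tail
    yield from build(sorted(states), size)


def search_markov_projections(states, block_info, threshold=1e-3):
    """Evaluate partitions from coarse to fine, pruning elements with high information."""
    n = len(states)
    # Step (1): start from all nonempty proper subsets of the state space.
    previous = {frozenset(c) for r in range(1, n) for c in combinations(sorted(states), r)}
    markov, level = [], 2
    while level <= n:
        current, composed = set(), False
        for partition in partitions_from_blocks(states, previous, level):
            composed = True
            infos = {block: block_info(partition, block) for block in partition}
            if max(infos.values()) <= threshold:      # step (2a)
                markov.append(partition)
            for block, info in infos.items():         # step (2b)
                if info <= threshold and len(block) <= n - level:
                    current.add(block)
        if not composed:
            break
        previous, level = current, level + 1          # step (3)
    return markov


def toy_info(partition, block):
    """Hypothetical oracle: an element is 'bad' if it separates states 1 and 2."""
    return 1.0 if len(block & {1, 2}) == 1 else 0.0


found = {frozenset(p) for p in search_markov_projections({1, 2, 3, 4}, toy_info)}
```

With the toy oracle, `found` contains the four partitions that keep states 1 and 2 together, and
the search stops before the four-element level because no admissible elements remain. The trivial
one-element partition is not enumerated since the procedure starts at level 2.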

4.3. Possible further pruning 

The algorithm may indeed be subject to improvements. If we assume that the
dynamics is ergodic then it follows that the cardinality of the process as a whole,
i.e. $|\Sigma|$, has the cardinality of its components as divisors. In other words, if we have
reason to assume that the time series that we observe includes all possible states
in the state space (in essence a weak ergodicity assumption), then we only need
to try partitions with cardinality that divides $|\Sigma|$. The reason for this is
straightforward. Assume that we combine two processes $T_A$ and $T_B$ with transition matrices
$Q_A$ ($n \times n$) and $Q_B$ ($m \times m$) respectively. If the two processes are combined in a
trivial way, i.e. there is no interaction between the sub-processes $T_A$ and $T_B$, then
the total process has the transition matrix $Q = Q_A \otimes Q_B$, an $n \cdot m \times n \cdot m$ matrix.
Our algorithm works from the other end. We are given a sequence of states from
which we can estimate the total transition matrix. Then it is clear that the number
of states in an independent sub-process must be a divisor of the number of states
in the process as a whole.
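
The argument can be made concrete with a small Python sketch. The two transition matrices below
are illustrative assumptions (row-stochastic, rows indexed by the current state), and the helper
names are our own.

```python
def kron(qa, qb):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in qa for row_b in qb]


def divisors(n):
    """All positive divisors of n, i.e. the candidate partition cardinalities."""
    return [d for d in range(1, n + 1) if n % d == 0]


q_a = [[0.0, 1.0], [1.0, 0.0]]   # assumed two-state bit-flip process
q_b = [[0.5, 0.5], [1.0, 0.0]]   # assumed two-state process where state 2 always returns to state 1
q = kron(q_a, q_b)               # transition matrix of the non-interacting combination

candidate_sizes = divisors(len(q))
```

Here `q` is a $4 \times 4$ row-stochastic matrix and `candidate_sizes` is `[1, 2, 4]`, so under the
stated assumption partitions with three elements would not need to be tested for this process.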

More generally, two processes can be combined in a more non-trivial fashion
where for example $T_A$ is slaved by $T_B$, i.e. the dynamics of $T_A$ is affected by the
state of $T_B$ but not vice versa (e.g. as in Figure 4(b)). The resulting process is then
described by a semi-direct product or a wreath product. We will not go into the
details of this algebraic structure here but refer the reader to any introductory text
on group theory, for example [23]. The overall conclusion above is however still valid;
the number of states in the process $T_B$ must be a divisor of the number of states in
the combined process. We can still decrease the number of tested partitions, again,
under the assumption that we have global coverage of the state space.

5. Discussion 

We have presented a method for inferring hierarchical dynamics from observed time
series. Alternatively we may say that the presented scheme detects components of
a process that in themselves have Markovian dynamics. The possible usefulness
of this, as well as other related methods for decomposing and reducing dynamical
systems, is great. Essentially all related methods are, however, either tailored for a
limited class of systems, or suffer from high computational complexity. Examples
of methods that exhibit the latter are Krohn-Rhodes theory, calculations of invariant
manifolds, and Markov partitions [24]. Our method is no exception though. The
added structure introduced in Section 4 does indeed reduce the number of potential
partitions of the state space, but it is not enough to remedy a computational cost
that in the worst case scales exponentially with the cardinality of the state space.
We believe that this problem is generic to any reduction scheme. In practice one
must hope that the process under analysis carries some additional structure that
allows us to make further assumptions about which types of projections make
sense to test. For example, if the system is a cellular automaton, it is clear that
the reduction should be faithful to the locality and translational invariance of the
update rule, i.e. only projections acting locally and independent of the position
on the underlying lattice should be evaluated. The reduction of possible partitions
from such considerations often recasts the problem into the computationally feasible
domain. Note that the tailoring needed is limited to the generation of test partitions;
the rest of the algorithm remains unchanged. This feature is appealing from the
implementation standpoint.

We end this presentation by briefly mentioning the types of systems for which
we hope that the method is useful. These may be spin systems (renormalization),
lattice gases (automated detection of hydrodynamic variables), pattern forming
cellular automata (it is speculated, and partially known, that these systems have
hierarchies of descriptions), and interaction networks (identification of functional
groups).
Appendix A. Preliminaries
A.1. Stochastic processes

Here we consider dynamical systems in the form of discrete stochastic processes.
They generate bi-infinite sequences $S$ of stochastic variables $S_t$,

$$S = (\ldots, S_{t-1}, S_t, S_{t+1}, \ldots), \tag{A.1}$$

where each $S_\tau$ is drawn from a state space $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$. The subscript $t$
denotes the present state, whereas $t + i$ with negative and positive indices $i$ denote
past and future states, respectively. We always assume that a process that generates
$S$ is stationary, i.e. that it is invariant under time translation:

\begin{align}
P(S_{t+n} = s_0, S_{t+n+1} = s_1, \ldots, S_{t+n+l} = s_l) &= \tag{A.2} \\
P(S_{t+m} = s_0, S_{t+m+1} = s_1, \ldots, S_{t+m+l} = s_l)&, \tag{A.3}
\end{align}

for all $n, m \in \mathbb{Z}$, $l \in \mathbb{N}$ and $s_k \in \Sigma$.
A.2. Markov processes

A process is said to have the Markov property if future states are conditionally
independent of past states, given the current state:

\begin{align}
P(S_{t+n} = s_0 \mid \ldots, S_{t+n-3} = s_3, S_{t+n-2} = s_2, S_{t+n-1} = s_1) &= \tag{A.4} \\
P(S_{t+n} = s_0 \mid S_{t+n-1} = s_1)&, \tag{A.5}
\end{align}

for all $n \in \mathbb{Z}$ and $s_k \in \Sigma$. Processes with this property are termed Markov processes.
A.3. Partitions and projections

By a partition $\mathcal{A}$ of the state space $\Sigma$, we refer to a set of disjoint subsets whose
union is $\Sigma$:

$$\mathcal{A} = \{a_1, a_2, \ldots, a_n\}, \tag{A.6}$$

where $\bigcup_{i=1}^{n} a_i = \Sigma$ and $a_i \cap a_j = \emptyset$ for all $i \neq j$. A projection

$$\Pi_\mathcal{A} : \Sigma \to \mathcal{A} \tag{A.7}$$

is a function that maps elements of $\Sigma$ onto their respective elements in $\mathcal{A}$.

Further, one may recursively partition a partition. A partition $\mathcal{B}$ of another
partition $\mathcal{A}$ is a set of unions of disjoint subsets of $\mathcal{A}$, where the union of the
elements of $\mathcal{A}$ and the union of the elements of $\mathcal{B}$ are equal. That is,

$$\mathcal{B} = \{b_1, b_2, \ldots, b_m\}, \tag{A.8}$$

where $b_i = \bigcup a_j$, $a_j \in \mathcal{A}$; $b_i \cap b_j = \emptyset$ for all $i \neq j$, and
$\bigcup_{i=1}^{n} a_i = \bigcup_{j=1}^{m} b_j = \Sigma$. The projection from $\mathcal{A}$ to $\mathcal{B}$ is analogous to (A.7).

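A.4. Example

We conclude the preliminaries with a simple example. Consider the process $T_\Sigma$
given by the finite state automaton in Figure 6. $T_\Sigma$ acts on the alphabet
$\Sigma = \{\sigma_1, \sigma_2, \sigma_3, \sigma_4\}$, is stationary, and fulfills the Markov property since the transition
probabilities from each state are independent of previously visited states. There
are 15 different possible partitions of $\Sigma$; for example $\mathcal{A} = \{\{\sigma_1\}, \{\sigma_2, \sigma_3\}, \{\sigma_4\}\}$,
$\mathcal{B} = \{\{\sigma_1\}, \{\sigma_2, \sigma_3, \sigma_4\}\}$, $\mathcal{C} = \{\{\sigma_1\}, \{\sigma_2\}, \{\sigma_3\}, \{\sigma_4\}\}$ and $\mathcal{D} = \{\Sigma\}$,
where $\mathcal{B}$, e.g., is a partition of $\mathcal{A}$. The projection $\Pi_\mathcal{A}$, for instance, gives
$\Pi_\mathcal{A}(\sigma_1) = \{\sigma_1\}$, $\Pi_\mathcal{A}(\sigma_2) = \Pi_\mathcal{A}(\sigma_3) = \{\sigma_2, \sigma_3\}$ and $\Pi_\mathcal{A}(\sigma_4) = \{\sigma_4\}$.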



